Applying to college is a rigorous process based on a reciprocal 
relationship between students and institutions, as both actively 
search to meet their own needs and aspirations for the best edu- 
cation and student body, respectively. The popular media often 
depict college admission as characterized by two extreme groups 
of applicants: those who are academically gifted, apply to many 
competitive institutions, and spend exorbitant amounts of money 
on college preparation (Rubin, 2008) and those who are under- 
prepared and unaware of the college admission process (Nyhan, 
2006). Although these two groups do exist, they are far from the 
norm of college applicants. They may be better exemplified as at 
least a few groups of students who can be classified on a variety of 
characteristics. As such, there arises a need to identify and describe 
these unique clusters of college applicants in order to guide their 
academic preparation for college success as well as to rethink college
recruitment and admission policies in order to maximize the
benefits of higher education for both students and institutions.

The media often communicate the existence of two distinct types of
college applicants: the frenzied, overachieving, anxious student who
applies to many institutions and the underprepared, less advantaged
student who is not at all familiar with the application process. Although
these two groups likely do exist, they are far from the norm of college
applicants, who are better exemplified as at least a few groups of
students who can be classified based on relevant characteristics. We
identified five unique clusters of students: Privileged High Achievers/Athletes,
Disadvantaged Students, Average Students Needing More
Guidance, Mostly Female Academics, and Privileged Low Achievers.
These clusters differed from each other based on variables including
academic performance, demographic characteristics, home and
school characteristics, participation in school activities, and the number
and types of higher education institutions to which they apply. An
understanding of these descriptive clusters, comprised of students with similar
backgrounds and goals for higher education, is a necessary first step in
developing more thoughtful and inclusive enrollment management and
college preparation practices.

Shaw, E. J., Kobrin, J. L., Packman, S. F., & Schmidt, A. E. (2009). Describing students
involved in the search phase of the college choice process: A cluster analysis study. Journal
of Advanced Academics, 20, 662-700.

Admission directors, institutional researchers, enrollment 
managers, and guidance counselors around the country are examining
questions related to the changing nature of college admission
in the 21st century. There have been demographic shifts
by ethnicity in high school and college attendance (Western
Interstate Commission for Higher Education, 2008), technological
advances have improved the dissemination of information
about colleges and universities (MacAllum, Glover, Queen,
& Riggs, 2007), rankings are playing a larger role in driving 
postsecondary educational goals (Thacker, 2008), debates about 
merit- versus need-based aid remain heated (Marklein, 2007; 
Wilkinson, 2005), and the admission profession is discussing 
the need to redefine its purpose (College Board’s Task Force on 
Admissions in the 21st Century, 2008). In an effort to understand 
where college admission is headed, it would seem important to 
reexamine who the college applicants are in the U.S. and how 
they can best be understood within a framework of important 
characteristics related to college choice and admission decisions. 
Ultimately, this can lead to targeted, innovative efforts to assist 
students in preparing for and successfully completing college. 


Related Literature 

Both student and institutional characteristics form the basis 
for decision-making about where students apply to college and 
how college admission officers recruit students. The major theory 
of college choice avers that selecting a postsecondary institution 
is a dynamic process with three distinct phases: predisposition, 
search, and choice (Hossler & Gallagher, 1987). During the predisposition
phase, an individual’s aspiration to attend higher education
is explored, which is influenced by the individual’s gender 
and ethnicity; socioeconomic status (SES); academic achieve- 
ment and ability; parental education and expectations; peer sup- 
port and peer college choice; high school counselor and teacher 
support; career plans; involvement in extracurricular activities; 
location of family residence; and high school quality and curric- 
ulum track. Once college aspirations are formulated, the search 
phase, in which students locate institutions that meet their most 
important criteria, begins. After applications are sent and deci- 
sion letters are received, the final phase of choice occurs and the 
process is completed with enrollment. Research has shown that 
the characteristics considered during the predisposition stage are 
present throughout the entire process. Given that these charac- 
teristics are the most influential indicators of college aspirations, 
search, and choice, identifying how these variables influence col- 
lege application and enrollment becomes a pertinent question 
for investigation. Prior research in the realm of college choice 
has primarily focused on the predisposition and choice phases 
of the process (Gonzalez & DesJardins, 2002; Weiler, 1994). 
This study will focus on describing those students engaged in 
the search phase in order to gain increased awareness and com- 
prehension of this understudied phase. 

Although students applying to college have changed quite 
dramatically since the 1980s, theorists and practitioners con- 
tinue to rely on older models of college choice that are in need of 
updating to reflect the current student population (Southerland, 
2006). Certainly many of the same variables included in these 
older models provide a foundation to understand the current 
college choice process (e.g., academic achievement, finances), 
but researchers are also aware that additional factors are more 
or less important to different groups of students such as first- 
generation college students (Cho, Hudley, Lee, Barry, & Kelly, 
2008) or nontraditional or adult students (Southerland, 2006). 
For example, first-generation students, because they tend to 
receive less assistance from parents and counselors in searching 
for colleges, often end up choosing to attend local, public 2-year 
institutions (MacAllum et al., 2007). Another more recent trend 
likely influencing the college choice process for certain students 
is the establishment of higher education access programs across 
the U.S. There has been a movement for campuses to sponsor 
programs designed to increase college access and enrichment 
opportunities for historically underserved, economically or educationally
disadvantaged students (Perna, 2002). These programs
likely play a strong role as to where the participants choose to
attend college (Bergin, Cooks, & Bergin, 2007). 

Factors Associated With College Choice 
and Application Behaviors 

Academic Achievement. The most apparent influence on college 
choice is academic achievement, which has been positively asso- 
ciated with college enrollment and generally guides the college 
search process (DesJardins, Ahlburg, & McCall, 2006; Manski
& Wise, 1983). Specifically, researchers found that the best predictors
of whether a student applied to any college or university
were high school grade point average (GPA) and SAT scores
(Hossler, Braxton, & Coopersmith, 1989; Manski & Wise, 1983).
This finding can be explained by the fact that most colleges and
universities utilize measures of high school achievement and
standardized test scores as a heuristic for selecting students.
In addition, many colleges and universities advertise the mean
and/or range of these measures for their most recently admitted 
freshman class in an effort to advise students on the probability 
of gaining admission based on their academic credentials. 

Gender. Gender and ethnicity both appear to have a role, pri- 
marily mediated by cultural values and SES, in how students 
approach the college application process (Hossler & Gallagher, 
1987). A recent survey on college enrollment indicated that 
there are more females in higher education than males (National 
Center for Education Statistics, 2005). However, there is a 
greater proportion of females in 2-year institutions compared 
to females enrolled in 4-year institutions and a greater pro- 
portion of males in 4-year institutions in comparison to males 
enrolled in 2-year institutions (National Center for Education 
Statistics, 2005). 
Race/Ethnicity and Best Language. Similarly, statistics on enrollment
by race/ethnicity showed that an overwhelming majority of 
enrollees in both 2- and 4-year institutions were White (National 
Center for Education Statistics, 2005). Asian Americans seem 
to present a unique minority group as their college enrollment 
rates rival those of White students (Goyette, 1999; Michaelides, 
2002). Latino students represent the highest proportion of stu- 
dents not applying to college by the end of 12th grade followed 
by African American students (Hurtado, Kurotsuchi Inkelas, 
Briggs, & Rhee, 1997). Asian American students were more 
likely than any other racial/ethnic subgroup to apply to five or 
more colleges, which is indicative of some strategic planning in 
the college choice process (Hurtado et al., 1997). 

Related to race/ethnicity, students’ best language (English 
versus another language) plays a role in their college choice 
process. There has been tremendous growth in the resident 
population in the U.S. over the past 10 years — particularly in 
the South, Southwest, and West (College Board, n.d.). Recent 
immigration has been primarily dominated by individuals from 
Latin American countries (Western Interstate Commission for 
Higher Education, 2008). For students for whom English is 
not their best language, the college search process can be quite 
complex. These students tend to be limited in the colleges they 
consider during the college search process, focusing mostly on 
nearby 2-year institutions (MacAllum et al., 2007). Most high 
schools and colleges are lacking bilingual materials and recruit- 
ers that could help explain and demystify the college application 
process for these students and families (College Board’s Task 
Force on Admissions in the 21st Century, 2008). 

Parental Income and Education Level. Parental income or SES 
seems to be the most cited and influential factor in college 
application and enrollment (Chapman, 1981; Delaney, 1998; 
Ganderton & Santos, 1995; Kane & Spizman, 1994; Rivkin,
1995; Somers, Cofer, & VanderPutten, 2002; Wilson-Sadberry,
Winfield, & Royster, 1991). Research has shown that students
with a higher parental income are more likely to attend a 4-year
and/or a private institution than students with a lower parental
income, who are more likely to attend a 2-year, public, and/or
in-state institution (Chapman, 1981). A recent study confirmed that
lower income students were less likely to apply to more expensive
institutions, clearly limiting the opportunities for higher
education among this group (Lillis & Tian, 2008). Low-SES
students are not only financially limited in the types of schools
they can afford to attend (e.g., 2-year vs. 4-year and in-state vs.
out-of-state), they are also limited in the number of institutions
they can apply to because of the high cost of application fees.
In addition, McDonough (1994) wrote that high-SES students
with moderate academic ability prepare to get into the “right”
college using innovative techniques including hiring independent
counselors, receiving assistance with essays, or arranging for
educational experiences or trips over summer breaks that might
enhance an application. 

Research also shows a positive relationship between the parents’
education level and the child’s education level (Hurtado et
al., 1997; Stage & Hossler, 1989), as well as students’ educational
expectations (Goyette, 1999). First-generation college-bound
students, or those students whose parents have no education
higher than a high school diploma, tend to be at a distinct disadvantage
when it comes to preparing for and applying to college
(Pascarella, Pierson, Wolniak, & Terenzini, 2004). They often are 
lacking in knowledge about the application process, costs, and 
other information related to attaining a college degree, including 
necessary high school preparation. 

Location of Residence and High School. Location of home resi- 
dence has also been shown to influence college predisposition 
and choice. Although research has indicated that students who 
reside in an urban setting are more likely to enroll in college 
than students who reside in a rural setting (Dahl, 1982), this 
difference seems to disappear after controlling for SES. Avery 
and Hoxby (2004) noted that low-income students, regardless 
of race or academic achievement, were more likely to respond 
negatively to a college’s distance from home. A report from the 
National Postsecondary Education Cooperative (MacAllum et 
al., 2007) summarized research on the role of geographic loca- 
tion in college choice, stating that proximity to home was of 
particular importance to African American and Hispanic stu- 
dents who did not want to leave their families and was of lesser 
importance to White students. 

Finally, Litten (1982) noted that high school characteristics, 
student performance, and high school curriculum influence student
aspirations to attend a particular institution. Specifically, 
social and cultural capital, operationalized by Perna (2000) as high 
school segregation and high school quality, influence the level of 
knowledge and available information on applying to college. 

Extracurricular Participation. School-based extracurricular activ- 
ity participation (e.g., school government, newspaper, perform- 
ing arts, academic clubs) is also associated with 4-year college 
enrollment (Horn, 1997; Perna, 2000), and higher educational 
and occupational aspirations exist for students who participate 
in these activities in comparison to those who do not (Marsh, 
1992). Perna (2000) speculated that this association may be due 
to the increased opportunity for exchange of information related 
to the college application and enrollment process. Participation
in sports was also positively associated with educational attainment
after high school (Marsh & Kleitman, 2002). 

Given the many influences on college application behaviors, 
this study aims to understand whether certain behaviors and char- 
acteristics of students applying to college can be clustered together 
to form meaningful groups of students for further exploration. 


Method 


Participants 

The sample for this study was taken from the College 
Board’s 2006 College Bound Seniors database. This database is 
comprised of the 1,465,744 students who took the SAT or at 
least one SAT Subject Test and planned to graduate from high 
school in 2006. 

Materials 

Five different databases were merged to conduct this 
research: the 2006 College Bound Seniors database with SAT 
Questionnaire data, the College Board’s Advanced Placement
(AP) program participation database, the 2005 Quality Education
Data (QED) National Education Database, and the 2006
Annual Survey of Colleges database. The variables examined in 
this study can be found in Table 1. 
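
Two of the derived variables in Table 1 follow simple coding rules that can be illustrated in a few lines. The minimal Python sketch below reproduces the nine-level SES code (three income bands crossed with three parental education bands) and the “mostly”/“varies” coding of the institutions to which a student sent scores, as defined in the note to Table 1. The function names, the dollar-valued income input, the education labels, and the treatment of a tie as “no mode” are assumptions made for illustration; the article does not describe the code actually used.

```python
from collections import Counter

# Hypothetical labels for the three education bands in the note to Table 1.
EDUCATION_BANDS = {
    "less_than_bachelors": 0,   # low education
    "bachelors": 1,             # medium education
    "more_than_bachelors": 2,   # high education
}


def income_band(income_dollars):
    """Return 0 (low), 1 (medium), or 2 (high) using the Table 1 cutoffs."""
    if income_dollars < 35_000:
        return 0
    if income_dollars <= 100_000:
        return 1
    return 2


def ses_code(income_dollars, education):
    """Combine income and education bands into the 1-9 SES code of Table 1."""
    return 3 * income_band(income_dollars) + EDUCATION_BANDS[education] + 1


def primary_category(categories):
    """Return the modal category, or None (missing) when there is no single mode.

    Assumption: a tie for the most frequent category counts as "no mode".
    """
    if not categories:
        return None
    ranked = Counter(categories).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def varies(categories):
    """Return 1 if at least two different types appear, otherwise 0."""
    return 1 if len(set(categories)) >= 2 else 0
```

For example, a family income of $60,000 with a parent holding a bachelor's degree maps to SES code 5 (medium income, medium education), and a student who sent scores to two urban campuses and one rural campus would be coded as “mostly urban” with campus type “varies.”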

Design and Procedure 

Cluster analysis seeks to identify homogeneous subgroups 
of cases in a population. The method includes a wide variety of 
procedures that are used to empirically form “clusters” or groups 
of highly similar entities (Aldenderfer & Blashfield, 1984). This 
study employed two-step clustering, which is the clustering proce- 
dure most suitable for large datasets that include both categorical 
and continuous variables. In the first step, cases were assigned to 
“preclusters” to reduce the size of the data matrix. In the second 
step, the preclusters were clustered using the agglomerative hier- 
archical clustering algorithm whereby each precluster began as its 
own cluster, and at successive steps, the preclusters were merged 
until all preclusters were formed into one total cluster. All variables 
were standardized to have the same mean and standard deviation 
to equalize variables measured on different scales. The solutions 
based on a number of different clusters were examined and the 
best solution was chosen based on the ease of interpretation. 
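
The standardization and merging steps described above can be sketched in plain Python. This is only an illustration of the general idea, not SPSS's two-step procedure: it handles continuous variables only, skips the preclustering step, and uses squared Euclidean distance between cluster centroids as a stand-in merge criterion. All function names are hypothetical.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    m = mean(column)
    s = pstdev(column)
    if s == 0:
        return [0.0 for _ in column]
    return [(x - m) / s for x in column]


def centroid(points):
    """Mean of each coordinate across the points in one cluster."""
    return [mean(dim) for dim in zip(*points)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def agglomerate(points, n_clusters):
    """Start with one cluster per point and merge the two closest
    clusters (by centroid distance) until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = squared_distance(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Standardizing each column before calling `agglomerate` keeps a variable measured on a wide scale (such as SAT score) from dominating one measured on a narrow scale (such as GPA). Letting the loop run down to a single cluster corresponds to the full merge sequence described in the text; stopping earlier yields a candidate solution with that many clusters.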

A number of cluster analyses were performed on the data to
determine the optimum number of clusters. Cluster solutions were
examined based on different combinations of variables guided by
Hossler and Gallagher’s (1987) theory of college choice. Many
of the variables from the 2006 Annual Survey of Colleges (ASC)
had a large number of missing values (the percentage of missing
cases ranged from 63% to 96%), thus the cluster solutions
including these variables involved only a small percentage of the
cases (students) in the 2006 SAT cohort file. The students with
valid values on the ASC variables were not representative of the
college-bound senior population, so the decision was made to
exclude the ASC variables when deriving the clusters and to use
these variables to describe the final clusters.

Table 1

Variables for Cluster Analysis

Variable Category
    Variable Name
        Variable Levels/Descriptive Statistics

Academic Achievement
    SAT Composite (Critical Reading and Math)
        M = 1027.03, SD = 200.70
        Min = 400, Max = 1600
    HSGPA
        M = 3.34, SD = 0.62
        Min = 0.00, Max = 4.30
    Number of AP exams taken
        0 = None
        1 = 1 or 2
        2 = 3 or more

Demographic Characteristics
    Gender and Minority Status
        1 = Male Minority
        2 = Male Nonminority
        3 = Female Nonminority
        4 = Female Minority
    English Best Language
        0 = English is best language
        1 = English is not best language

Parental Income and Education Level
    Socioeconomic Status (SES)ᵃ
        1 = Low income, low education
        2 = Low income, medium education
        3 = Low income, high education
        4 = Medium income, low education
        5 = Medium income, medium education
        6 = Medium income, high education
        7 = High income, low education
        8 = High income, medium education
        9 = High income, high education
    First-Generation Student
        0 = No (at least one parent has an Associate’s Degree or higher)
        1 = Yes (highest level of parental education is less than an Associate’s Degree)

Location of Home Residence and School
    Percent of HS eligible for free lunch
        1 = Low (under 20%)
        2 = Moderate (20-30%)
        3 = High (31-50%)
        4 = Very high (greater than 50%)
    Size of HS
        1 = Small (less than 500 students)
        2 = Medium (500-999 students)
        3 = Large (1,000 or more students)
    Region of country
        1 = West
        2 = Southwest
        3 = South
        4 = Midwest
        5 = Mid-Atlantic
        6 = New England
    Majority of HS is college-bound
        0 = No (less than 50% of HS is college-bound)
        1 = Yes (more than 50% of HS is college-bound)
    Majority of HS is comprised of minority students
        0 = Yes (minority students are more than 50% of student body)
        1 = No (minority students are less than 50% of student body)

Extracurricular Participation
    Number of activities the student participated in for 2 or more years
        M = 1.66, SD = 1.38
        Min = 0, Max = 5
    Number of types of activities the student participated in during HS
        M = 3.35, SD = 2.52
        Min = 0, Max = 10
    Student participated in varsity sports for 2 or more years
        0 = No
        1 = Yes

Institutions of Interest
    Number of institutions to which student sent scores
        M = 4.12, SD = 3.46
        Min = 0, Max = 30
    Primary selectivity level of institutions sent scoresᶜ
        1 = Mostlyᵇ highly selective
        2 = Mostly selective publics
        3 = Mostly selective privates
        4 = Mostly moderately selective publics
        5 = Mostly moderately selective privates
        6 = Mostly nonselective publics
        7 = Mostly nonselective privates
        8 = Mostly 2-year institutions
    Variability of selectivity level of institutions sent scores
        0 = Selectivity does not vary
        1 = Selectivity varies
    Primary type of campus of institutions sent scores
        1 = Mostly urban
        2 = Mostly suburban
        3 = Mostly rural
    Variability of type of campus sent scores
        0 = Type of campus does not vary
        1 = Type of campus varies
    Primary distance of institutions sent scores from student’s home
        0 = Mostly in-state or bordering home state
        1 = Mostly out-of-state
    Variability of distance of institutions sent scores from student’s home
        0 = Distance does not vary
        1 = Distance varies

Note. ᵃLow education is defined by less than a Bachelor’s degree, medium education is
defined by a Bachelor’s degree, and high education is defined by more than a Bachelor’s
degree. Low income is defined by a family income of less than $35K, a medium income is
defined by $35K to $100K, and high income is defined by more than $100K. ᵇIn this table,
“mostly” refers to the mode among institutions. If there was no mode, then the student was
coded as missing data in this category. ᶜThe selectivity of the institutions was determined
based on data from the 2004-2005 Annual Survey of Colleges (percentage of applicants
admitted, mean SAT and ACT scores). In this table, “varies” is defined by the students
sending their scores to at least two different types within the category.

The final cluster solution was chosen based on both statistical 
criteria and ease of interpretation. The auto-clustering procedure
in SPSS, using the Schwarz Bayesian Information Criterion
(BIC), was used to determine the number of clusters. This procedure
determines the number of clusters in which the BIC is
small and the change in BIC between adjacent numbers of clusters
is small in comparison to that for all other possible numbers
of clusters. The auto-clustering procedure chose five clusters for
the final solution, with a BIC of 7009466.933 and a BIC change
of -185598.544 between a five- and six-cluster solution. A list of 
the final variables used in this solution can be found in Table 1; 
however, as mentioned before, none of the variables describing 
the institutions of interest (ASC variables) were included in the 
cluster analysis. A separate cluster was created to include outlier 
cases that did not fit well into any other cluster. 
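
The quantities that such a BIC comparison relies on can be tabulated with a short Python sketch. It is a simplified illustration under stated assumptions: the function names are hypothetical, the BIC values in the usage note are invented, and the final decision rule used by SPSS is not reproduced here.

```python
def bic_changes(bic_by_k):
    """Change in BIC when moving from each cluster count to the next one.

    bic_by_k maps a number of clusters to the BIC of that solution.
    The result is keyed by the larger cluster count of each adjacent pair.
    """
    ks = sorted(bic_by_k)
    return {k: bic_by_k[k] - bic_by_k[prev_k] for prev_k, k in zip(ks, ks[1:])}


def ratio_to_first_change(changes):
    """Express each BIC change relative to the first change.

    Small ratios mark cluster counts where adding a cluster improves
    the BIC only slightly compared with the initial improvement.
    Assumes the first change is nonzero.
    """
    first = changes[min(changes)]
    return {k: change / first for k, change in changes.items()}
```

With invented BIC values of 1000, 700, 550, and 500 for one to four clusters, the changes are -300, -150, and -50, and the ratios to the first change are 1.0, 0.5, and about 0.17, so the gain from adding a fourth cluster is small relative to the earlier gains.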

There are several possible methods for validating the results 
of a cluster analysis. Techniques considered appropriate for the 
two-step method of clustering include: performing significance
tests on variables used to create the clusters, replicating
the cluster solution on an independent sample, performing significance
tests on variables that are not used to create the clusters,
and Monte Carlo procedures to generate an artificial data
set and compare cluster solutions on the real and artificial data
(Aldenderfer & Blashfield, 1984). In this study, where the decision
to accept or reject a cluster solution was based primarily
on the face validity of the results, two different methods were
chosen to validate the cluster solution: performing significance
tests on variables used to create the clusters (internal validation)
and performing significance tests on variables that are not used
to create the clusters (external validation). For those variables
included in the cluster analysis, chi-square tests of association
were performed for each categorical variable by cluster, and all
tests were statistically significant at p < .01. A series of one-way
analyses of variance (ANOVAs) were conducted for each continuous
variable included in the cluster analysis and all tests were
statistically significant at p < .01, which indicated significant
differences in the means across the clusters. Post-hoc multiple
comparisons using the Bonferroni correction revealed that all
pairwise comparisons of clusters were statistically significant for
all continuous variables at p < .01. 

Performing an external validation of a cluster solution is 
considered to be one of the stronger methods of validation, as 
Aldenderfer and Blashfield (1984) stated that “the value of a 
cluster solution that has successfully passed an external valida- 
tion is much greater than a solution that has not” (p. 66). To 
perform an external validation, the first two authors arrived at a 
priori hypotheses regarding expected differences among clusters 
on the following variables: the number of honors or awards that 
a student received in high school, whether the student was a 
first-generation college student, and the student’s highest edu- 
cational degree goal. The results of these validation analyses 
showed that all a priori hypotheses were supported by ANOVA
and chi-square results. Specifically, Cluster 1 had significantly
more honors received than Cluster 5, F(5, 728,003) = 11211.70,
p < .001. Cluster 2 had significantly more first-generation college
students than Cluster 4, F(5, 728,003) = 17,588.93, p < .001.
Cluster 4 had significantly more students desiring the highest
degree goal level (doctoral) than Cluster 3, χ²(30, N = 688,709)
= 23145.50, p < .001 (Cluster 3 standardized residual = -27.4; 
Cluster 4 standardized residual = 54.1). 
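
The two kinds of statistics reported in this validation, a one-way ANOVA F ratio and standardized residuals from a chi-square table, can be computed as in the following Python sketch. The function names are hypothetical, and the standardized residual is assumed to be the Pearson form, (observed - expected) / sqrt(expected); the article does not state which form was used.

```python
from math import sqrt
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


def standardized_residuals(table):
    """Pearson residuals for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            res_row.append((observed - expected) / sqrt(expected))
        residuals.append(res_row)
    return residuals
```

A large positive residual for a cell, such as the one reported for Cluster 4, indicates that many more students fall in that cell than would be expected if cluster membership and degree goal were unrelated; a large negative residual indicates the opposite.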


Results 

Table 2 displays the correlation matrix of the variables 
included in the cluster analysis. Tables 3, 4, and 5 display the fre- 
quency distribution and effect sizes of each categorical variable 
for the five clusters; the number and percentage of cases in each 
cluster is shown in the header rows of these tables. Table 6 dis- 
plays the means, standard deviations, and effect sizes of the con- 
tinuous variables by cluster. Variable importance plots, or graphs 
that show which variables are more or less important in differen- 
tiating the five clusters, are provided in the Appendix. For each 
cluster, there is one plot with categorical variable information and 
one with continuous variable information. Variables with higher 
chi-square or t values than other variables are more important in 
differentiating a particular cluster from other clusters. 

As a measure of practical significance, effect sizes were com- 
puted for all statistical tests to show the strength of the relation- 
ship or mean differences between variables and clusters (Trusty, 
Thompson, & Petrocelli, 2004). Cohen’s d standardized-difference
effect size was calculated for each continuous variable and 
Cramer’s V was calculated for categorical variables. Cohen’s d 
was calculated by subtracting the total group mean on a vari- 
able from the cluster mean and then dividing by the total group 
(pooled) standard deviation. This value can be negative or posi- 
tive and a value of zero would indicate no differences between 
groups. Cohen (1988) provided guidance in the interpretation 
of effect sizes by characterizing an effect size of 0.2 as small, an 
effect size of 0.5 as medium, and an effect size of 0.8 as large. In 
this study, Cohen’s d ranged from -0.65 to 0.61, with the d value
smallest in magnitude being -0.07. Cramer’s V is an index of the degree of association
in a contingency table that is larger than 2 x 2 (Hayes, 
1994). It ranges between 0.00 and 1.00, and the higher the value 
the more strongly related two variables are considered to be. In 
this study, Cramer’s V ranged from 0.07 to 0.91. While there is 
no standard way to interpret Cramer’s V, values closer to 0.00 
indicate a weaker relationship. 
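
Both effect sizes can be written out in a few lines of Python. The sketch below follows the definition of Cohen’s d given above (cluster mean minus total group mean, divided by the total group standard deviation) and the usual formula for Cramer’s V, the square root of chi-square divided by N times one less than the smaller table dimension. The function names are hypothetical and the code is an illustration, not the software used in the study.

```python
from math import sqrt


def cohens_d_vs_total(cluster_mean, total_mean, total_sd):
    """Standardized difference between a cluster mean and the total group mean."""
    return (cluster_mean - total_mean) / total_sd


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramer's V: sqrt(chi-square / (N * (min(rows, columns) - 1)))."""
    n = sum(sum(row) for row in table)
    smaller_dim = min(len(table), len(table[0]))
    return sqrt(chi_square(table) / (n * (smaller_dim - 1)))
```

Using values reported in this article for the SAT composite (a Cluster 2 mean of 897 against a total group mean of 1027.03 and standard deviation of 200.70), `cohens_d_vs_total(897, 1027.03, 200.70)` gives about -0.65, which agrees with the d reported for Cluster 2.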

Description of the Clusters 

Effect Sizes. An examination of the standardized differences 
among the continuous variables showed some meaningful dif- 
ferences across the clusters. The largest effect sizes were found in 
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Correlations were rounded to the nearest hundredth and this significant correlation is actually larger than 0.00. 
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Med income, low education 34,799 19.6 26,332 22.7 61,686 38.0 16,966 20.1 45,239 24.9 

Med income, med education 36,417 20.S 11,607 10.0 26,613 16.4 16,726 19.8 33,599 18.5 

Med income, high education 23,782 13.4 6,985 6.0 15,483 9.5 12,497 14.8 19,473 10.7 

High income, low education 6,906 3.9 1,715 1.5 6,055 3.7 2,786 3.3 9,634 5.3 

High income, med education 25,755 14.5 2,208 1.9 4,601 2.8 8,635 10.2 21,829 12.0 


Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5 

Privileged High Disadvantaged Average Students Mosdy Female Privileged Low 

Achievers/ Athletes Students Needing More Academics Achievers 

N= 177,287 N= 115,784 Guidance N= 84,306 N= 182,045 

N= 162,198 


DESCRIBING STUDENTS 


=! 

O 


On so 
00 to 
(N 

to SO 
rsj -LO 


00 

SO 
SO CO 


O to 00 to 

Os CN to OO CO 00 O 

- -- 00 SO 


CO 
(N 


CO 


(N 


0 

O 


o to 


CO to (N rsj CO <N p 

O 00 CO On O 00 O 

I-H CO rsj 1 -H o 


rsj r>s so CO 00 o 

O CO 00 CO O rsj 

CO rsj tr^ p^ os^ rsj c^ 

sd rsT 

^ rsj rsj ,-H 


0 

O 


d 


rsj O' 00 00 CO so so 

■rH -rH d 00 K to Os 

CO rsj -rH 


CO to 
rsj 
CO 00 


CO 00 Os 


to 00 i-H to CO 

O O' 00 i-H so to Os 

00 00 rsj rsj to o 


0 

U 


rsj p to 
n-h CO to 

rsj CO rsj 


O CO so 


i-H to 00 
00 so CO 
rsj^ On^ tr^ 
O trT os" 
rsj i-H 


O' rsj so rsj Tj- CO 
o r-s n-h so o o CO 
CO 00 o o 


0 

o 


to to 00 to rsj rsj rsj 


00 O' o 

CO so 00 
O' CO Os 


O' CO so 00 00 
CO CO 00 to 


S 

b- 

u 


• ^ 
•+>» 


o 

O 

a 


be 

£ 


2 


H 


OJ W 




o ^ 

OJ 

Q 




< M S 


o 

Q 




V 

a. 


678 Journal of Advanced Academics 


172,093 97.1 26,968 23.3 54,884 33.8 


Table 4 

Frequencies and Effect Sizes of Categorical FHigh School-Level Variables by Cluster 
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Varies 52,765 46.5 38,946 56.9 42,014 48.7 25,312 45.3 46,928 47.4 

Does not vary 60,592 53.5 29,500 43.1 44,234 51.3 30,533 54.7 52,030 52.6 
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SAT scores, with Cluster 2 showing significantly lower perfor- 
mance than the other clusters (d = -0.65) and Cluster 4 show- 
ing significantly higher performance (d = 0.61). An examination 
of effect sizes for the categorical variables across clusters using 
Cramér’s V showed that there were a number of major differences 
among these groups. The largest effect sizes were found 
in the following variables: Majority of Students in High School 
is Racial/Ethnic Minority (V = 0.91, p < .001), Participated 
in Varsity Sports for 2+ Years (V = 0.78, p < .001), Percent of 
Students in High School Eligible for Free Lunch (V = 0.51, 
p < .001), Majority of Students in High School is Considered 
College Bound (V = 0.44, p < .001), Number AP Exams Taken 
(V = 0.40, p < .001), First-Generation College Student (V = 0.33, 
p < .001), and Gender by Minority Status (V = 0.33, p < .001). 
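
As an illustration of how an effect size such as Cramér’s V is obtained from a table of counts, the following is a minimal sketch and not the authors' code. It assumes the standard definition, V = sqrt(chi-square / (n * (min(rows, columns) - 1))). The function name `cramers_v` is hypothetical, and the example counts are the "Varies" and "Does not vary" rows that appear under Table 4. Whether the reported V values were computed on exactly this kind of cluster-by-category table is an assumption.

```python
import math


def cramers_v(table):
    """Cramér's V for an r x c contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Scale by the sample size and the smaller table dimension minus one.
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


# Counts for Clusters 1-5 taken from the two rows shown under Table 4
# (row 1: "Varies", row 2: "Does not vary").
table4_rows = [
    [52765, 38946, 42014, 25312, 46928],
    [60592, 29500, 44234, 30533, 52030],
]
print(round(cramers_v(table4_rows), 3))
```

A value near 0 indicates that the category proportions are nearly the same in every cluster, and a value near 1 indicates that cluster membership almost fully determines the category.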

Cluster Names. All clusters were given descriptive labels based on 
the results of the analyses. Cluster 1 was labeled Privileged High 
Achievers/Athletes. Nearly half of the students in this cluster 
(49%) were male nonminority students who attended affluent 
high schools with other mostly nonminority students. Many of 
the students in this cluster resided in the Midwest, Mid-Atlantic, 
or New England. Students in this cluster sent their SAT scores to 
the highest number of institutions compared to the other clusters. 
This cluster also had the largest percentage of students partici- 
pating in varsity sports for 2 or more years (97%), as well as the 
smallest percentage of first-generation college students (19%). 

Cluster 2 was labeled Disadvantaged Students. Nearly three 
quarters of the students in this cluster were minority students 
(29% male minority and 46% female minority) who attended 
mostly large high schools that had a large proportion of minor- 
ity students and students eligible for free lunch. This cluster also 
had the highest percentage of students reporting that their best 
language was something other than English, as compared to the 
other clusters (5%). These students tended to come from families 
with low income and low parental education, and many resided 
in the West, Southwest, and South. Fifty-five percent of the stu- 
dents in this cluster attended high schools where the majority 
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of students were college-bound. The students in this cluster had 
one of the lowest mean high school GPAs (M = 3.18, SD = 0.66, 
d = -0.25) and the lowest mean combined SAT score (M = 897, 
SD = 189, d = -0.65). This cluster also had the highest percentage 
of students (83%) among the five clusters who indicated a plan 
to earn a bachelor’s, master’s, or doctoral degree as well as the 
highest percentage of students considered to be first-generation 
college students (62%). 

Cluster 3 was labeled Average Students Needing More 
Guidance. This cluster was comprised of mostly nonminor- 
ity students (42% male and 37% female) who largely attended 
high schools with similar racial/ethnic compositions in the West, 
South, or Mid-Atlantic. Many of the students in this cluster (53%) 
attended high schools where the majority was not college-bound. 
Half of the students in Cluster 3 are considered first-generation 
college students. Cluster 3 also had the second highest percentage 
of students reporting their parents’ income and education level to 
both be low (22%). The average high school GPA for this cluster 
(M = 3.29) was the median among the five clusters, and the average 
combined SAT critical reading and math score (M = 984) 
was only higher than that of Cluster 2. The effect size of the SAT 
for Cluster 3 was -0.22 compared to the total group including all 
clusters. Cluster 3 was also the median among the five clusters in 
terms of the average number of extracurricular activity types for 
2 or more years as well as the average number of extracurricular 
activity categories overall during high school. 
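
The effect sizes quoted for the SAT can be checked against one another. The sketch below assumes, as the text suggests but does not state as a formula, that each d is the cluster mean minus the total-group mean, divided by the total-group standard deviation. The total-group mean and standard deviation are not reported in this section, so the code back-solves them from the two reported pairs for Cluster 2 (M = 897, d = -0.65) and Cluster 4 (M = 1150, d = 0.61). The function names are hypothetical and the back-solved values are only approximate because the reported figures are rounded.

```python
def total_group_stats(mean_a, d_a, mean_b, d_b):
    """Back-solve the total-group mean and SD from two (cluster mean, d) pairs.

    Assumes d = (cluster_mean - total_mean) / total_sd for both clusters.
    """
    total_sd = (mean_b - mean_a) / (d_b - d_a)
    total_mean = mean_a - d_a * total_sd
    return total_mean, total_sd


def effect_size(cluster_mean, total_mean, total_sd):
    """Standardized difference between a cluster mean and the total-group mean."""
    return (cluster_mean - total_mean) / total_sd


# Reported values: Cluster 2 (M = 897, d = -0.65) and Cluster 4 (M = 1150, d = 0.61).
total_mean, total_sd = total_group_stats(897, -0.65, 1150, 0.61)

# Cluster 3 reports M = 984; the implied d should match the reported -0.22.
d_cluster3 = effect_size(984, total_mean, total_sd)
print(round(d_cluster3, 2))  # -0.22
```

Under this assumption the three reported SAT effect sizes are mutually consistent, which supports reading each d as a comparison of the cluster with the total group rather than with another cluster.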

Cluster 4 was labeled Mostly Female Academics. Nearly 
77% of the students in this cluster were female, nonminority 
students. The students in this cluster also largely attended high 
schools located in the West, South, or Mid-Atlantic with mostly 
nonminority students. The students in this cluster tended to take 
the most AP exams and did not participate in varsity sports. 
These students had the highest mean high school GPA among 
the clusters, with the smallest standard deviation (M = 3.69, SD 
= 0.46, d = 0.56), as well as the highest average combined SAT 
critical reading and math scores with the smallest variability (M 
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= 1150, SD = 164, d = 0.61). Most of these students (75%) were 
not first-generation college students. 

Cluster 5 was labeled Privileged Low Achievers. This cluster 
was comprised of mostly nonminority students (47% male and 
37% female) who attended wealthy high schools with mostly 
nonminority students. This cluster had the largest percentage of 
cases across clusters attending schools where the majority of stu- 
dents were college-bound (94%). Yet, this cluster also had the 
highest percentage of students who did not take any AP exams 
(88%) and the highest percentage that did not send their SAT 
scores to any colleges (27%). In addition, the students in this 
cluster had the lowest mean high school GPA among the clusters 
(M = 3.17, SD = 0.61, d = -0.27) and participated in the fewest 
extracurricular activities for 2 or more years (d = -0.28), as 
well as the fewest extracurricular activities overall during 
high school (d = -0.27). This group had the 
smallest percentage among the five clusters (76%) indicating 
their aspirations to earn a bachelor’s, master’s, or doctoral degree. 
Approximately one third of Cluster 5 students are considered 
first-generation college students. 

SAT Score-Sending Behaviors. Cluster 1 had the most variability in 
the type of campus to which students sent their SAT scores. Most 
of the students in all five clusters sent their SAT scores to only 
one type of campus setting (rural, urban, or suburban). However, 
approximately 17% of the students in Cluster 1 sent their scores 
to a variety of campus settings compared to 12-14.5% of stu- 
dents in the other clusters. Clusters 1 and 4 showed the largest 
percentage of students sending their SAT scores to institutions 
out of their home state (22% and 20% respectively), compared 
to 9.5-13% of students in the other clusters. Cluster 2 had the 
largest percentage of students (91%) sending their SAT scores 
to institutions mostly in state or in a state bordering their state. 

Cluster 2 showed the most individual variability among the 
clusters in the types of institutions students chose with regard 
to control (public or private) and selectivity. Approximately 
57% of the students in Cluster 2 sent their scores to a variety 
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of types of institutions compared to 45-49% of students in the 
other clusters. Students in Cluster 2 also tended to send their 
scores to more moderately selective public institutions. Cluster 
3 had the largest percentage among the clusters of students who 
sent scores to nonselective public institutions or 2-year colleges 
(29%). Cluster 4 students were the most likely to send their 
scores to similar types of institutions in terms of control and 
selectivity, and many of these students sent their scores to highly 
selective or moderately selective public institutions. 

There was virtually no variation between the clusters in 
whether or not the institutions to which students sent scores 
varied in distance from the students’ homes. With regard to 
campus environment, the students in Cluster 2 tended to prefer 
mostly urban settings (55%), while the majority of students in 
Clusters 1, 4, and 5 sent their scores to mostly suburban cam- 
puses. Cluster 3 showed the most variation in the type of campus 
environment and had the largest percentage of students among 
the clusters (18%) who sent their scores to mostly rural colleges. 


Discussion 

College applicants enter the college admission process with 
distinct characteristics and college aspirations. As such, exam- 
ining the different combinations of these variables to cluster 
similar students may be a useful way to understand the popula- 
tion of college-bound students and serve their unique needs. The 
cluster analysis in this study demonstrated that different pat- 
terns of characteristics and behaviors were associated with dif- 
ferent clusters of students. Five unique clusters of students were 
identified: Privileged High Achievers/Athletes, Disadvantaged 
Students, Average Students Needing More Guidance, Mostly 
Female Academics, and Privileged Low Achievers. Given that 
the media has focused its attention on the very high achiev- 
ing students that apply to 20 or more schools with stellar test 
scores and the students who are ill-prepared to take on academic 
challenges of college (e.g., Pappano, 2007), one would expect 
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to find two clusters of college applicants — the Privileged High 
Achievers/Athletes (Cluster 1) and the Disadvantaged Students 
(Cluster 2). However, this study found five. In particular, there 
is little mention in the media or elsewhere regarding students in 
the Average Students Needing More Guidance (Cluster 3) or 
the Privileged Low Achievers (Cluster 5) clusters. 

The Privileged High Achievers/Athletes cluster (Cluster 
1) was comprised of many nonminority males attending afflu- 
ent high schools. The students in this cluster are similar to the 
“all-around” students we hear so much about. They have many 
resources at their fingertips to navigate the college search and 
choice process. Given that this cluster has the lowest percentage 
of first-generation college-bound students, it is likely that they 
are able to rely on their families for information about applying 
to and attending college. 

The Disadvantaged Students cluster (Cluster 2) was com- 
prised of students who appeared to be lacking in financial, social, 
and educational resources and were lagging in academic qualifications 
as they embarked on the college search and choice process. 
Not surprisingly, this cluster had the largest percentage of 
first-generation college-bound students as well as those reporting 
their best language to be other than English. 

The Average Students Needing More Guidance cluster 
(Cluster 3) was comprised of students that would seem to benefit 
from the knowledge of guidance counselors with regard to navi- 
gating the college search, choice, and financial planning process. 
Given that this cluster had the median high school GPA, it is 
surprising that, on average, the slight majority of the students in 
the high schools they attended were not college bound. However, 
the large percentage of first-generation college-bound students 
in this cluster may explain this finding. Also, the students in this 
cluster were involved in their schools via extracurricular activity 
participation, which signified an interest in contributing to their 
high school and interacting with the community — and is known 
to be related to a host of beneficial academic and nonacademic 
outcomes during and after high school (Marsh & Kleitman, 2002). 
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The Mostly Female Academics cluster (Cluster 4) resembled 
the Privileged High Achievers/Athletes (Cluster 1) except that 
the students in the Mostly Female Academics cluster did not 
participate in sports and had a smaller percentage of students 
reporting high parental income. This cluster was also the most 
likely to show little variation in the types of schools to which 
they sent their scores (with regard to control and selectivity) and 
many of them sent scores to highly selective or moderately selec- 
tive public institutions. 

The Privileged Low Achievers cluster (Cluster 5) was the 
largest cluster among the five clusters. This was somewhat sur- 
prising, as it seems these were students who had an excellent 
foundation for success in high school, yet they fell short aca- 
demically and with regard to aspirations for higher education. 

We seek to cluster or classify students into like groups in 
order to better understand their differences and similarities. 
Although there were similarities between clusters, such as the 
Privileged High Achievers/Athletes (Cluster 1) and Privileged 
Low Achievers (Cluster 5) with their similar proportions of 
minority students and students in higher SES categories, the 
clusters clearly differed on the basis of academic performance. 
Thus, merely classifying students by gender and minority status 
or SES would have lumped students into groups with discrepant 
achievement patterns and motivations — clearly important parts 
of the college search and choice process. This exemplifies the 
utility in exploring these data with cluster analysis techniques. 


Implications and Directions 
for Future Research 

It is likely that these clusters have implications for admis- 
sion directors, institutional researchers, enrollment managers, 
and school counselors, as well as the popular media, in develop- 
ing and evaluating interventions and outreach efforts that aid 
students in the college preparation, search, and choice processes. 
First, it would be useful to test hypotheses related to tailoring 
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such efforts for the different subgroups. The Average Students 
Needing More Guidance (Cluster 3) may have been more suc- 
cessful in college with greater access to educational resources, 
and in particular, college preparatory classes and college applica- 
tion, search, and choice direction from parents and educators. 
This cluster had the second highest percentage of first-generation 
college-bound students (50%), so it is likely that many of 
the parents of the students in this cluster are not fully aware of 
the application and enrollment process and, thus, cannot pro- 
vide the necessary support to their children. The educational 
system may be failing these students, and that fact is going undetected. 
Middle school guidance counselors can play a helpful 
role with this group by having students, as well as their parents, 
think about the college process early. Curricula, often available 
in English and Spanish, exist to help students, along with their 
parents, prepare for college starting in the seventh grade through 
counselor-led lessons in study skills, coursework planning, the 
college application process, and financial aid. 

Although we may have expected to find the Disadvantaged 
Students cluster (Cluster 2) in the data based on national demographic 
information (e.g., MacAllum et al., 2007), as well as 
the popular media (e.g., Schworm, 2008), this does not mean 
that this group of students and their unique challenges should 
be overlooked or ignored. New and innovative efforts must be 
directed at this continuously underserved group. This cluster 
deserves special attention and deeper exploration in the realm of 
educational interventions that can aid in the academic prepara- 
tion of these students and start them on an early and smooth 
path toward higher education. One example of a promising 
innovative effort aimed at these students is the Posse Program 
(Glater, 2006), in which small, diverse groups of inner-city students 
are selected and trained to attend a selective college or university 
together so that they can support one another; these students are 
graduating at a rate of 90%. 

The Privileged Low Achievers (Cluster 5) seem to be the 
students who are equipped with the resources to succeed; how- 
ever, they do not take advantage of these resources for various 


Volume 20 Number 4 Summer 2009 689 


DESCRIBING STUDENTS 


reasons. Investigations into their academic motivation or expec- 
tations may be worthy of further study. It is possible that these 
students may benefit from taking a year between high school 
and college to participate in a community service program, for 
example, in order to grow and mature and determine what they 
might want to focus on academically. 

Future research should attempt to replicate the analyses in 
this study with different student cohorts to determine the stabil- 
ity of the findings. It would also be useful to arrive at hypoth- 
eses regarding cluster-tailored outreach strategies related to the 
college application and choice process. A follow-up qualitative 
investigation, potentially with focus groups of students based on 
cluster membership, may be worthwhile to address the specific 
needs and issues that these groups of students are confronting. 
For example, the Mostly Female Academics (Cluster 4) may 
benefit from learning techniques to handle the academic pressures 
they face throughout the college application process, while 
the Average Students Needing More Guidance (Cluster 3) may benefit 
from greater direct outreach from colleges and universities, 
whether this entails meetings with admission counselors or more 
personal invitations to open houses. 


Limitations 

There are a number of limitations of this study that are 
related to the nature of the sample. All students in the data- 
base examined took the SAT and their institutions of interest 
were determined from the colleges and universities that received 
their scores. As such, students who applied to schools that did 
not require test scores or accepted unofficial score reports would 
not be captured in our database. Additionally, sending scores to 
a particular institution does not guarantee that an application 
will also be sent to that school. Thus, the database used in this 
study did not include information on where students ultimately 
applied or enrolled, but rather all of the institutions they considered 
seriously enough to send their scores. 
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Conclusion 

Those involved in enrollment management are well aware of 
the changing nature of college admissions and the students who 
will be “knocking on the door” (Western Interstate Commission 
for Higher Education, 2008). Colleges and universities can- 
not continue to search and recruit the same types of students 
they have been recruiting in the past. Population and demo- 
graphic shifts in the United States present new landscapes and 
challenges for those in higher education. This study provides a 
unique glimpse into who is currently applying to college with 
data that are not often jointly considered, including students’ 
demographic, geographic, high school, aspirational, and target 
college characteristics. Ongoing examinations are necessary as 
the importance of understanding applicants from all angles cannot 
be overstated. 
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End Notes 

1 This database is comprised of the students who participated 
in the SAT program and reported to graduate from high school 
in 2006 and includes student responses to the questionnaire on 
their demographic and academic background, as well as higher 
education preferences, completed at the time of SAT registration. 

2 The QED database includes public and private school data 
from the Common Core of Data from the National Center for 
Education Statistics (e.g., percent eligible for free lunch, high 
school size, etc.). 

3 The Annual Survey of Colleges is a yearly College Board 
survey of colleges, universities, vocational/technical, and gradu- 
ate schools, the objective of which is to obtain information that 
is important for potential students. It covers both the informa- 
tion that potential students want to know and the information 
that the institutions feel that students should know, such as 
admission requirements, the number of applicants and admitted 
students, the most popular majors on campus, etc. 
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Appendix 

Variable Importance Plots 


Cluster 1: Privileged High Achievers/Athletes 
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Cluster 2: Disadvantaged Students 


[Variable importance plot. Categorical variables (x-axis: Chi-Square, 0 to 500,000): Minority x Gender Status; # of Varsity Sports (categor.); # of AP Exams (categor.); English not Best Language; # of Colleges Sent Scores (categor.); HS Size (categor.); Majority HS College Bound; % HS - Free Lunch; SES; HS Mostly Minority. Continuous variables (x-axis: Student's t, -250 to 250): # of Activities; SAT V+M; HSGPA; # of Extracurricular Categories.] 




Cluster 3: Average Students Needing More Guidance 
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Cluster 4: Mostly Female Academics 
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Cluster 5: Privileged Low Achievers 
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